KIISE Transactions on Computing Practices
Title |
Improving the Inference Time of the Deep Learning Model with Partial Skip of ReLU-fused Matrix Multiplication Operations |
Author |
Sungkyun Kim
Gunjoo Ahn
Nahum Kim
Jiwon Seo
|
Citation |
Vol. 28, No. 3, pp. 139-145 (Mar. 2022) |
Abstract |
Deep learning is being applied to an ever-widening range of fields, and large-scale deep learning models with many parameters tend to perform well. Because inference with such large models inevitably demands substantial resources and long running times, reducing inference time is essential for their efficient use. In this paper, we fuse the activation function Rectified Linear Unit (ReLU) with matrix multiplication in the inference process and reduce the amount of computation by predicting in advance the sign of the output values to be produced by the two fused operations: since ReLU zeroes out negative outputs, a value predicted to be negative need not be computed in full. We propose four such computation-skipping methods and, by comparing them, derive an optimal method that saves inference time while losing almost no accuracy. |
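The skipping idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function name, the fixed `probe_frac` split of the inner dimension, and the zero threshold are assumptions for exposition, and the paper's four actual methods are not reproduced here. The core point it shows is that a cheap partial dot product can predict the output sign, and outputs predicted negative can be emitted as 0 directly because ReLU would zero them anyway.

```python
import numpy as np

def relu_fused_matmul_partial_skip(x, W, probe_frac=0.25, threshold=0.0):
    """Illustrative ReLU-fused matmul with partial skip (hypothetical API).

    For each output element, first compute a partial dot product over the
    leading `probe_frac` fraction of the inner dimension. If the partial sum
    falls below `threshold`, predict the full result is negative and emit 0
    directly (ReLU would zero it anyway); otherwise finish the dot product
    and apply ReLU.
    """
    k = W.shape[0]
    split = max(1, int(k * probe_frac))
    # Cheap probe over the leading part of the inner dimension.
    partial = x[:, :split] @ W[:split, :]
    out = np.zeros_like(partial)
    # Elements predicted non-negative; only these need the remaining work.
    mask = partial >= threshold
    # For clarity this sketch computes the remainder densely; a real kernel
    # would compute it only where `mask` is True to actually save work.
    rest = x[:, split:] @ W[split:, :]
    out[mask] = np.maximum(partial[mask] + rest[mask], 0.0)
    return out
```

Note the accuracy/speed trade-off the abstract refers to: when the probe's partial sum and the full sum disagree in sign, an output that ReLU would have kept is wrongly zeroed, so the probe size and threshold control how much computation is skipped versus how much accuracy is lost.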
Keyword |
deep learning optimization
omitted computation
fully-connected layer
inference optimization
|